Towards Full Text Shallow Discourse Relation Annotation: Experiments with Cross-Paragraph Implicit Relations in the PDTB

Authors

  • Rashmi Prasad
  • Katherine Forbes-Riley
  • Alan Lee
Abstract

Full text discourse parsing relies on texts comprehensively annotated with discourse relations. To this end, we address a significant gap in the inter-sentential discourse relations annotated in the Penn Discourse Treebank (PDTB), namely the class of cross-paragraph implicit relations, which account for 30% of inter-sentential relations in the corpus. We present our annotation study to explore the incidence rate of adjacent vs. non-adjacent implicit relations in cross-paragraph contexts, and the relative degree of difficulty in annotating them. Our experiments show a high incidence of non-adjacent relations that are difficult to annotate reliably, suggesting the practicality of backing off from their annotation to reduce noise for corpus-based studies. Our resulting guidelines follow the PDTB adjacency constraint for implicits while employing an underspecified representation of non-adjacent implicits, and yield 62% inter-annotator agreement on this task.

Similar resources

Towards an Annotated Corpus of Discourse Relations in Hindi

We describe our initial efforts towards developing a large-scale corpus of Hindi texts annotated with discourse relations. Adopting the lexically grounded approach of the Penn Discourse Treebank (PDTB), we present a preliminary analysis of discourse connectives in a small corpus. We describe how discourse connectives are represented in the sentence-level dependency annotation in Hindi, and disc...

Experiments with Annotating Discourse Relations in the Hindi Discourse Relation Bank

In the Hindi Discourse Relation Bank (HDRB) project, we are developing a large corpus annotated with discourse relations, such as causal, temporal, contrastive and conjunctive relations. Adopting the lexically grounded approach of the Penn Discourse Treebank (PDTB), we annotate the argument structure of both explicit and implicit discourse relations, as well as the senses of relations. We descr...

The Penn Discourse Treebank 2.0 Annotation Manual

This report contains the guidelines for the annotation of discourse relations in the Penn Discourse Treebank (http://www.seas.upenn.edu/~pdtb), PDTB. Discourse relations in the PDTB are annotated in a bottom up fashion, and capture both lexically realized relations as well as implicit relations. Guidelines in this report are provided for all aspects of the annotation, including annotation expli...

Inferring Discourse Relations from PDTB-style Discourse Labels for Argumentative Revision Classification

Penn Discourse Treebank (PDTB)-style annotation focuses on labeling local discourse relations between text spans and typically ignores larger discourse contexts. In this paper we propose two approaches to infer discourse relations in a paragraph-level context from annotated PDTB labels. We investigate the utility of inferring such discourse information using the task of revision classification....

Implicit Discourse Relation Classification via Multi-Task Neural Networks

Without discourse connectives, classifying implicit discourse relations is a challenging task and a bottleneck for building a practical discourse parser. Previous research usually makes use of one kind of discourse framework such as PDTB or RST to improve the classification performance on discourse relations. Actually, under different discourse annotation frameworks, there exist multiple corpor...

Publication date: 2017